Automatic non-linear metric learning: Application to gesture recognition
Authors
Abstract
As consumer devices become increasingly ubiquitous, new interaction solutions are required. In this thesis, we explore inertial-based gesture recognition on smartphones, where gestures holding a semantic value are drawn in the air with the device in hand. Three main approaches based on accelerometer and gyrometer data exist in the literature. The earliest methods model the temporal structure of a gesture class, for example with Hidden Markov Models, while another approach matches gestures against reference instances using a non-linear distance measure, generally based on Dynamic Time Warping. Finally, features can be extracted from gesture signals in order to train specific classifiers, such as Support Vector Machines. In our research, the speed and delay constraints imposed by the application are critical, which leads us to choose neural-based models. While Bi-Directional Long Short-Term Memory and Convolutional neural networks have already been investigated, the main challenge is the open-world problem, which requires not only good classification performance but, above all, an excellent ability to reject unknown classes. Our work therefore focuses on metric learning between gesture sample signatures using the "Siamese" architecture (Siamese Neural Network, SNN) applied to the MultiLayer Perceptron, which aims to model semantic relations between classes in order to extract discriminative features. Unlike some popular versions of this algorithm, we opt for a strategy that does not require fine-tuning an additional parameter during training, namely a fixed threshold on dissimilar outputs. After a preprocessing step in which the data is filtered and normalised spatially and temporally, the SNN is trained on sets of samples composed of similar and dissimilar examples to compute a higher-level representation of the gesture, in which features are collinear for similar gestures and orthogonal for dissimilar ones.
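As a rough illustration of the collinear/orthogonal objective described above, the following Python sketch passes both samples of a pair through the same small MLP and scores the pair by the squared gap between the cosine of the two outputs and a target of 1 (similar) or 0 (dissimilar). The function names `embed`, `cosine_similarity` and `cosine_pair_loss`, the tanh activations, the layer sizes and the squared-error form are all assumptions made for illustration; the abstract does not specify the exact error function or network configuration.

```python
import numpy as np

def embed(x, W1, b1, W2, b2):
    # One-hidden-layer MLP. Both samples of a pair go through this
    # function with the SAME weights, which is what makes it "Siamese".
    hidden = np.tanh(W1 @ x + b1)
    return np.tanh(W2 @ hidden + b2)

def cosine_similarity(a, b, eps=1e-12):
    # eps guards against division by zero for all-zero outputs.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def cosine_pair_loss(out_a, out_b, similar):
    # Assumed squared-error form: the target cosine is 1 for a similar pair
    # (collinear outputs) and 0 for a dissimilar pair (orthogonal outputs).
    target = 1.0 if similar else 0.0
    return (target - cosine_similarity(out_a, out_b)) ** 2

# Hypothetical sizes: 30-dimensional gesture signature, 16 hidden units, 8 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 30)), np.zeros(16)
W2, b2 = rng.normal(size=(8, 16)), np.zeros(8)

# Two random stand-in signatures, treated here as a dissimilar pair.
x_a, x_b = rng.normal(size=30), rng.normal(size=30)
loss = cosine_pair_loss(embed(x_a, W1, b1, W2, b2),
                        embed(x_b, W1, b1, W2, b2),
                        similar=False)
```

Because the dissimilar target in this sketch is a fixed angle rather than a margin, there is no threshold to tune, which is in line with the motivation given in the abstract.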
While the original model already works for classification, we identify several mathematical problems that can impair its learning capabilities. Consequently, in contrast to the classical input-set selection strategies, which use either a similar or dissimilar pair or a triplet made of a reference, a similar and a dissimilar sample, we propose to include samples from every available dissimilar class, resulting in a better structuring of the output space. Moreover, we apply a regularisation on the outputs so that the objective function is better determined. Furthermore, the notion of polar sine enables a redefinition of the angular problem by maximising a normalised volume induced by the outputs of the
This thesis is available at: http://theses.insa-lyon.fr/publication/2016LYSEI014/these.pdf © [S.C. Berlemont], [2016], INSA Lyon, all rights reserved
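The polar sine mentioned above generalises the sine of an angle to a set of vectors: it is the volume of the parallelotope they span divided by the product of their norms, so it equals 1 when the vectors are mutually orthogonal and 0 when they are linearly dependent. The Python sketch below computes it from the Gram matrix of the normalised vectors. The function name `polar_sine` and the toy inputs are illustrative, and how the thesis embeds this quantity in its training objective is not shown in the truncated abstract.

```python
import numpy as np

def polar_sine(vectors, eps=1e-12):
    # vectors: array-like of shape (n, d), one vector per row.
    # The value is only non-zero when n <= d and the rows are independent.
    A = np.asarray(vectors, dtype=float)
    # Normalise each row so the result depends on directions only.
    A = A / (np.linalg.norm(A, axis=1, keepdims=True) + eps)
    # det(Gram) is the squared volume spanned by the unit vectors.
    gram = A @ A.T
    # Clamp tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(np.linalg.det(gram), 0.0)))

orthogonal = polar_sine(np.eye(3))                # mutually orthogonal rows
collinear = polar_sine([[1.0, 0.0], [2.0, 0.0]])  # linearly dependent rows
```

For two vectors this reduces to the absolute value of the ordinary sine of the angle between them, so maximising it pushes a pair of outputs towards orthogonality.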
Similar resources
Automatic Non Linear Metric Learning - Application to Gesture Recognition. (Apprentissage automatique de métrique non linéaire - Application à la reconnaissance de gestes)
Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition
In this paper, the use of clustering algorithms for decision-level data fusion is proposed. The results of automatic isolated word recognition, derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined using fuzzy clustering algorithms, in particular fuzzy k-means and fuzzy vector quantization. Experimental results show that the...
Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study
Despite considerable advances in recognizing hand gestures from still images, many challenges remain in the classification of hand gestures in videos. Videos bring additional difficulties, including higher computational complexity and the arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...
Semi-supervised composite kernel learning using distance metric learning techniques
The distance metric plays a key role in many machine learning and computer vision algorithms, so the choice of an appropriate distance metric directly affects their performance. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...